COST OF LIVING IN THE UK: INSIGHTS VIA AI, DATA VISUALIZATION, AND STORYTELLING

By Amin Sabbagh
Student ID: 18055392
EthOS Reference Number: 64001

Imported Libraries

  • Pandas: Used to read data from Excel into DataFrames.

  • NumPy: Used for numerical computing and array operations.

  • Matplotlib.pyplot: Used for creating static data visualizations in Python, such as scatter plots with fitted regression lines.

  • Plotly Express: Used to create interactive data visualizations.

  • SciPy.stats: Provides functions for statistical computations and hypothesis testing.

  • Seaborn: A data visualization library for drawing attractive statistical graphics.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from scipy import stats
import seaborn as sns

Linear Regression Models

In [2]:
# Specify the workbook and the sheet from which to read data
file_name = "COL.xlsx"
sheet_name = 'Sheet1'

# Read columns C to I: CPI plus the six sector series
df = pd.read_excel(file_name, sheet_name=sheet_name, usecols='C:I')

df
Out[2]:
     CPI  Food Inflation  Resturants Inflation  Rental Price  House Price  Motor Fuel Price  Gas inflation
  0  0.6            -2.6                   1.8           2.6          7.8              -7.3           -6.0
  1  0.6            -2.2                   1.7           2.6          7.7              -7.3           -6.0
  2  0.8            -2.8                   1.9           2.6          8.4              -9.2           -6.0
  3  0.7            -2.5                   2.1           2.6          7.9              -7.5           -7.3
  4  0.7            -2.8                   2.2           2.5          8.0              -6.8           -6.7
...  ...             ...                   ...           ...          ...               ...            ...
 91  6.3            13.6                   8.8           5.6         -0.4             -16.4            1.7
 92  6.3            12.2                   9.1           5.7         -1.4              -9.7            1.7
 93  4.7            10.1                   8.8           6.1         -1.3              -7.6          -31.0
 94  4.2             9.2                   8.2           6.2         -2.3             -10.6          -31.0
 95  4.2             8.0                   7.7           6.2         -1.4             -10.8          -31.0

96 rows × 7 columns

In [3]:
food = np.array(df['Food Inflation'])
resturants = np.array(df['Resturants Inflation'])
rent = np.array(df['Rental Price'])
house = np.array(df['House Price'])
fuel = np.array(df['Motor Fuel Price'])
gas = np.array(df['Gas inflation'])


y = np.array(df['CPI'])

Linear Regression Models for Food and Restaurant Inflation

Food Inflation

The linear regression model for food reveals a strong relationship between CPI and food inflation, as illustrated by the close alignment of data points with the regression line.

  • Key Findings:
    • Changes in food inflation are strongly associated with corresponding changes in CPI.
    • The observed correlation coefficient of 0.93 indicates a robust and positive relationship between these variables.
    • The coefficient of determination of 0.89 indicates a strong level of fit, suggesting that food inflation is a significant determinant of fluctuations in the overall cost of living.
In [4]:
slope_food, intercept_food, r_food, p_food, std_err_food = stats.linregress(food, y)

def foodFunc(x):
  return slope_food * x + intercept_food

foodModel = list(map(foodFunc, food))

plt.scatter(food, y)
plt.plot(food, foodModel)
plt.xlabel('Food Inflation')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for Food Inflation')
plt.show()
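
The cell above fits the line but does not print the statistics quoted in the key findings. As a minimal sketch of how the correlation coefficient and coefficient of determination can be read off a `stats.linregress` result, the example below uses synthetic stand-in data: `food_demo` and `cpi_demo` are hypothetical arrays, not values from COL.xlsx, so the printed numbers will not match the figures reported above.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (hypothetical values, not taken from COL.xlsx)
rng = np.random.default_rng(0)
food_demo = np.linspace(-3.0, 14.0, 96)
cpi_demo = 0.4 * food_demo + 1.5 + rng.normal(0.0, 0.8, size=food_demo.size)

# linregress returns slope, intercept, rvalue, pvalue and stderr
result = stats.linregress(food_demo, cpi_demo)

r = result.rvalue      # Pearson correlation coefficient
r_squared = r ** 2     # coefficient of determination for a simple linear fit

print(f"slope={result.slope:.3f}, intercept={result.intercept:.3f}")
print(f"r={r:.2f}, R^2={r_squared:.2f}, p-value={result.pvalue:.3g}")
```

In the notebook itself, the same quantities are already available as `r_food` and `p_food` after running the cell above.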

Restaurant Inflation

The analysis also indicates a strong relationship between CPI and restaurant inflation, as evidenced by the close alignment of data points with the regression line.

  • Key Findings:
    • The correlation coefficient of 0.91 suggests a strong and positive relationship between CPI and restaurant inflation.
    • The coefficient of determination of 0.83 indicates that changes in restaurant inflation are closely associated with fluctuations in CPI.
In [5]:
slope_resturant, intercept_resturant, r_resturant, p_resturant, std_err_resturant = stats.linregress(resturants, y)

def resturantFunc(x):
  return slope_resturant * x + intercept_resturant

resturantModel = list(map(resturantFunc, resturants))

plt.scatter(resturants, y)
plt.plot(resturants, resturantModel)
plt.xlabel('Restaurant Inflation')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for Restaurant Inflation')
plt.show()

Overall Food Inflation

  • Importance of Food Inflation:

    • These strong correlations underscore the critical role that overall food inflation plays in shaping consumer price dynamics in the UK.
    • Fluctuations in food prices can significantly impact overall inflation levels, individual purchasing power, and financial well-being.
  • Monitoring Food Price Trends:

    • The close alignment between CPI and overall food inflation highlights the importance of closely monitoring food price trends as a key indicator of broader economic conditions and consumer sentiment.
  • Predictive Power of Regression Model:

    • The robustness of the regression model suggests that food inflation serves as a reliable predictor of changes in CPI and vice versa.
    • This provides valuable insights for lawmakers and economists in forecasting inflationary trends and formulating appropriate policy responses.
  • Policy Implications:

    • Lawmakers can adopt targeted measures to address challenges related to food price stability and mitigate the impact of inflation on household budgets.
    • Businesses and investors can leverage this insight to make informed decisions regarding pricing strategies and investment opportunities in the food sector.
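
The remark that food inflation predicts CPI "and vice versa" can be made concrete. The sketch below, which uses hypothetical synthetic arrays `x_demo` and `y_demo` rather than the notebook's data, fits the regression in both directions: the correlation coefficient is the same either way, but the two slopes differ, and their product equals the coefficient of determination.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (hypothetical, not taken from COL.xlsx)
rng = np.random.default_rng(1)
x_demo = rng.normal(5.0, 4.0, size=96)                        # stand-in for food inflation
y_demo = 0.4 * x_demo + 1.5 + rng.normal(0.0, 1.0, size=96)   # stand-in for CPI

fit_y_on_x = stats.linregress(x_demo, y_demo)   # predict CPI from food inflation
fit_x_on_y = stats.linregress(y_demo, x_demo)   # predict food inflation from CPI

# The correlation is symmetric; the slopes are not simple reciprocals
slope_product = fit_y_on_x.slope * fit_x_on_y.slope

print(f"r (y on x) = {fit_y_on_x.rvalue:.3f}, r (x on y) = {fit_x_on_y.rvalue:.3f}")
print(f"slope (y on x) = {fit_y_on_x.slope:.3f}, slope (x on y) = {fit_x_on_y.slope:.3f}")
print(f"product of slopes = {slope_product:.3f}, r^2 = {fit_y_on_x.rvalue ** 2:.3f}")
```

The practical point is that a forecast should use the fit whose dependent variable is the quantity being forecast; inverting the other line gives a different answer unless the correlation is perfect.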

Linear Regression Model for Housing Prices

Rental Prices

The linear regression model for rent prices reveals an intriguing finding: the data follow a curvilinear pattern rather than a linear relationship.

  • Key Findings:
    • The relationship between rent prices and CPI is non-linear, so the strength of the association varies as rent prices change.
    • The observed correlation coefficient of 0.69 indicates a moderate positive relationship between rent prices and CPI.
    • The coefficient of determination of 0.47 suggests a moderate level of fit: a straight line through rent prices accounts for just under half of the variance in CPI.

Insights into Rent-CPI Relationship

  • Curvilinear Relationship:

    • The curvilinear relationship sheds light on the complex dynamics between rent prices and overall consumer prices.
    • It implies a more nuanced interaction, possibly due to shifts in rental demand and policy interventions.
  • Policy Implications:

    • Recognizing this non-linear nature is crucial for lawmakers and economists addressing housing affordability and inflationary pressures.
    • By understanding the nuances of this relationship, policymakers can develop targeted interventions to stabilise housing costs and mitigate the broader impact on consumer prices.
In [6]:
slope_rent, intercept_rent, r_rent, p_rent, std_err_rent = stats.linregress(rent, y)

def rentFunc(x):
  return slope_rent * x + intercept_rent

rentModel = list(map(rentFunc, rent))

plt.scatter(rent, y)
plt.plot(rent, rentModel)
plt.xlabel('Rent Prices')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for Rent Prices')
plt.show()
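
One way to check a suspected curvilinear pattern is to compare a straight-line fit with a quadratic fit on the same data. The sketch below is an illustration under stated assumptions, not the method used in this notebook: `rent_demo` and `cpi_demo` are hypothetical synthetic arrays built to be curved, `r_squared` is a helper introduced here, and `np.polyfit` is swapped in for `stats.linregress` so that both degrees can be fitted the same way.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic curved data (hypothetical, not taken from COL.xlsx)
rng = np.random.default_rng(2)
rent_demo = np.linspace(1.0, 6.5, 96)
cpi_demo = 0.6 * (rent_demo - 3.0) ** 2 + 1.0 + rng.normal(0.0, 0.5, size=96)

# Least-squares polynomial fits of degree 1 and degree 2
linear_coeffs = np.polyfit(rent_demo, cpi_demo, deg=1)
quadratic_coeffs = np.polyfit(rent_demo, cpi_demo, deg=2)

r2_linear = r_squared(cpi_demo, np.polyval(linear_coeffs, rent_demo))
r2_quadratic = r_squared(cpi_demo, np.polyval(quadratic_coeffs, rent_demo))

print(f"R^2 linear    = {r2_linear:.2f}")
print(f"R^2 quadratic = {r2_quadratic:.2f}")
```

A higher-degree polynomial never fits the training data worse than a lower-degree one, so only a clear gain in R-squared, ideally confirmed on held-out data, supports the curvilinear reading.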

House Prices

The linear regression model for house prices presents a scatterplot with no discernible relationship.

  • Key Findings:
    • The observed correlation coefficient of 0.19 suggests a weak positive relationship between house prices and CPI.
    • The coefficient of determination of only 0.03 indicates a very weak level of fit.
    • This weak correlation suggests that changes in house prices may not have a significant impact on overall consumer prices.
In [7]:
slope_house, intercept_house, r_house, p_house, std_err_house = stats.linregress(house, y)

def houseFunc(x):
  return slope_house * x + intercept_house

houseModel = list(map(houseFunc, house))

plt.scatter(house, y)
plt.plot(house, houseModel)
plt.xlabel('House Prices')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for House Prices')
plt.show()
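
Whether a correlation this weak is distinguishable from zero can be checked with the standard t-test for a Pearson correlation. The sketch below applies it to the rounded correlation of 0.19 reported above and the 96 rows in the dataframe; because the input is rounded, the result is approximate. The exact p-value for the fitted line is already held in `p_house` from the cell above.

```python
import math
from scipy import stats

r = 0.19   # correlation reported above for house prices (rounded)
n = 96     # number of rows in the dataframe

# Test statistic for H0: no linear correlation, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

# Two-sided p-value from the Student t distribution
p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - 2)

print(f"t = {t_stat:.2f}, two-sided p-value = {p_value:.3f}")
```

A p-value above the conventional 0.05 threshold would be consistent with the reading that house prices show no clear linear relationship with CPI in this sample.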

Linear Regression Model for Overall Fuel Inflation

Motor Fuel Prices

The linear regression model for motor fuel inflation indicates a weak relationship, as depicted by scattered data points with no discernible pattern.

  • Key Findings:
    • The correlation coefficient of 0.4 suggests a limited positive association between fuel inflation and CPI.
    • The coefficient of determination of 0.16 indicates a weak level of fit.
    • Changes in fuel prices may have a minor impact on overall consumer prices, as implied by this weak correlation.
In [8]:
slope_fuel, intercept_fuel, r_fuel, p_fuel, std_err_fuel = stats.linregress(fuel, y)

def fuelFunc(x):
  return slope_fuel * x + intercept_fuel

fuelModel = list(map(fuelFunc, fuel))

plt.scatter(fuel, y)
plt.plot(fuel, fuelModel)
plt.xlabel('Motor Fuel Inflation')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for Motor Fuel Inflation')
plt.show()

Gas Prices

The linear regression model for gas inflation reveals a notable relationship between CPI and gas inflation, as evidenced by the close alignment of data points with the regression line.

  • Key Findings:
    • The correlation coefficient of 0.87 suggests a strong positive association between gas inflation and CPI.
    • The coefficient of determination of 0.75 indicates a strong level of fit.
    • Changes in gas prices move closely with CPI, as implied by this strong correlation.
    • The close alignment of data points underscores the link between gas prices and CPI fluctuations, highlighting the importance of monitoring energy costs during periods of economic unrest and when assessing broader economic trends and inflationary pressures.
In [9]:
slope_gas, intercept_gas, r_gas, p_gas, std_err_gas = stats.linregress(gas, y)

def gasFunc(x):
  return slope_gas * x + intercept_gas

gasModel = list(map(gasFunc, gas))

plt.scatter(gas, y)
plt.plot(gas, gasModel)
plt.xlabel('Gas Inflation')
plt.ylabel('Consumer Price Inflation')
plt.title('Linear Regression Model for Gas Inflation')
plt.show()

Summary of Sector Impacts

In summary, the data reveals distinct impacts of the cost-of-living crisis across various sectors.

  • Key Findings:
    • Food inflation appears to be the most affected, followed by gas inflation, and then rental prices.
    • Conversely, house prices and motor fuel prices seem to be the least affected.
    • This trend is evident in both the "Correlation of Cost Of Living" heatmap and the "Coefficients of determination Values for Cost of Living Crisis" bar chart.
In [10]:
plt.figure(figsize=(16, 6))
heatmap = sns.heatmap(df.corr(), vmin=-1, vmax=1, annot=True, cmap='BrBG')
heatmap.set_title('Correlation of Cost Of Living', fontdict={'fontsize':18}, pad=12);
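
The heatmap shows every pairwise correlation, but the ranking discussed in the summary depends only on the CPI column. As a small sketch, assuming a dataframe shaped like `df`, the example below pulls out that column and sorts it. `demo_df` and its column names are hypothetical stand-ins, not the notebook's data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in dataframe (hypothetical, not taken from COL.xlsx)
rng = np.random.default_rng(3)
cpi = rng.normal(3.0, 2.0, size=96)
demo_df = pd.DataFrame({
    "CPI": cpi,
    "Strongly related": 2.0 * cpi + rng.normal(0.0, 0.5, size=96),
    "Weakly related": 0.2 * cpi + rng.normal(0.0, 3.0, size=96),
})

# Correlation of each series with CPI, strongest first
cpi_corr = demo_df.corr()["CPI"].drop("CPI").sort_values(ascending=False)

print(cpi_corr.round(2))
```

Applied to the notebook's data, the equivalent expression would be `df.corr()["CPI"].drop("CPI").sort_values(ascending=False)`.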
In [11]:
# Coefficient of determination (R^2) for each variable,
# computed as the square of the correlation coefficient r
COD_food = r_food ** 2
COD_resturant = r_resturant ** 2
COD_rent = r_rent ** 2
COD_house = r_house ** 2
COD_fuel = r_fuel ** 2
COD_gas = r_gas ** 2
In [12]:
categories = ['Food', 'Restaurant', 'Gas', 'Rent', 'Fuel', 'House']
r_squared_values = [COD_food, COD_resturant, COD_gas, COD_rent, COD_fuel, COD_house]

data = pd.DataFrame({'Category': categories, 'R-squared': r_squared_values})

fig = px.bar(data, x='Category', y='R-squared', title='Coefficients of determination Values for Cost of Living Crisis',
             labels={'R-squared': 'R-squared'},
             hover_data={'R-squared': ':.2f'},
             color='R-squared',
             color_continuous_scale='Blues')

fig.update_layout(yaxis_range=[0, 1])

fig.show()
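
The per-variable regression cells above repeat the same steps six times. A possible alternative, sketched here on a hypothetical synthetic dataframe (`demo_df`, "Series A" and "Series B" are illustrative names), is to loop over every non-CPI column and collect the R-squared values into a table that could be passed to `px.bar` in the same way as `data` above.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in dataframe (hypothetical, not taken from COL.xlsx)
rng = np.random.default_rng(4)
cpi = rng.normal(3.0, 2.0, size=96)
demo_df = pd.DataFrame({
    "CPI": cpi,
    "Series A": 1.5 * cpi + rng.normal(0.0, 1.0, size=96),
    "Series B": rng.normal(0.0, 1.0, size=96),
})

# Regress CPI on each remaining column and keep the squared correlation
r_squared = {
    column: stats.linregress(demo_df[column], demo_df["CPI"]).rvalue ** 2
    for column in demo_df.columns.drop("CPI")
}

# Tidy table sorted from strongest to weakest fit
summary = (
    pd.DataFrame({"Category": list(r_squared), "R-squared": list(r_squared.values())})
    .sort_values("R-squared", ascending=False)
    .reset_index(drop=True)
)

print(summary.round(2))
```

Sorting the table before plotting also removes the need to order the categories by hand.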